Leaderless deterministic chemical reaction networks

David Doty* Monir Hajiaghayi†



Abstract 



This paper answers an open question of Chen, Doty, and Soloveichik [5], who showed that a
function $f : \mathbb{N}^k \to \mathbb{N}^l$ is deterministically computable by a stochastic chemical reaction network
(CRN) if and only if the graph of $f$ is a semilinear subset of $\mathbb{N}^{k+l}$. That construction crucially
used "leaders": the ability to start in an initial configuration with constant but non-zero counts
of species other than the $k$ species $X_1, \ldots, X_k$ representing the input to the function $f$. The
authors asked whether deterministic CRNs without a leader retain the same power.

We answer this question affirmatively, showing that every semilinear function is deterministically
computable by a CRN whose initial configuration contains only the input species
$X_1, \ldots, X_k$, and zero counts of every other species. We show that this CRN completes in
expected time $O(n)$, where $n$ is the total number of input molecules. This time bound is slower
than the $O(\log^5 n)$ achieved in [5], but faster than the $O(n \log n)$ achieved by the direct
construction of [5] (Theorem 4.1 in the latest online version of [5]), since the fast construction of
that paper (Theorem 4.4) relied heavily on the use of a fast, error-prone CRN that computes
arbitrary computable functions, and which crucially uses a leader.



1 Introduction

In the last two decades, theoretical and experimental studies in molecular programming have shed
light on the problem of integrating logical computation with biological systems. One goal is to
re-purpose the descriptive language of chemistry and physics, which describes how the natural world
works, as a prescriptive language of programming, which prescribes how an artificially engineered
system should work. When the programming goal is the manipulation of individual molecules in a
well-mixed solution, the language of chemical reaction networks (CRNs) is an attractive choice. A
CRN is a finite set of reactions such as $X + Y \to X + Z$ among abstract molecular species, each
describing a rule for transforming reactant molecules into product molecules.

CRNs may model the "amount" of a species as a real number, namely its concentration (average 
count per unit volume), or as a nonnegative integer (total count in solution, requiring the total 
volume of the solution to be specified as part of the system). The latter integer counts model is 
called "stochastic" because reactions that discretely change the state of the system are assumed 
to happen probabilistically, with reactions whose reactants have high molecular counts more likely 
to happen first than reactions whose molecular counts are smaller. The computational power of 
CRNs has been investigated with regard to simulating boolean circuits [12], neural networks [10], 
digital signal processing [11], and simulating bounded-space Turing machines with an arbitrary 
small, non-zero probability of error with only a polynomial slowdown [3]. CRNs are even efficiently
Turing-universal, again with a small, nonzero probability of error over all time [13]. Certain CRN 
termination and producibility problems are undecidable [8,16], and others are PSPACE-hard [15]. 
It is also difficult to design a CRN to "delay" the production of a certain species [6,7]. Using a 
theoretical model of DNA strand displacement, it was shown that any CRN can be transformed 
into a set of DNA complexes that approximately emulate the CRN [4] . Therefore even hypothetical 
CRNs may one day be reliably implementable by real chemicals. 

While these papers focus on the stochastic behaviour of chemical kinetics, our focus is on
CRNs with deterministic guarantees on their behavior. Some CRNs have the property that they
deterministically progress to a correct state, no matter the order in which reactions occur. For
example, the CRN with the reaction $X \to 2Y$ is guaranteed eventually to reach a state in which
the count of $Y$ is twice the initial count of $X$, i.e., it computes the function $f(x) = 2x$, representing
the input by species $X$ and the output by species $Y$. Similarly, the reactions $X_1 \to 2Y$ and
$X_2 + Y \to \emptyset$, under arbitrary choice of sequence of the two reactions, compute the function
$f(x_1, x_2) = \max\{0, 2x_1 - x_2\}$.
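The order-independence claimed for the two-reaction example can be checked with a small simulation. The following is a minimal sketch, not part of the original construction: the function name `run_max_crn` and the uniformly random scheduler are illustrative assumptions, and any other choice of applicable reaction would serve equally well.

```python
import random


def run_max_crn(x1, x2, seed=0):
    """Fire X1 -> 2Y and X2 + Y -> (nothing) in a random order until neither applies."""
    rng = random.Random(seed)
    counts = {"X1": x1, "X2": x2, "Y": 0}
    while True:
        enabled = []
        if counts["X1"] >= 1:
            enabled.append("produce")
        if counts["X2"] >= 1 and counts["Y"] >= 1:
            enabled.append("cancel")
        if not enabled:
            return counts["Y"]  # no reaction is applicable, so the count of Y is final
        if rng.choice(enabled) == "produce":
            counts["X1"] -= 1   # X1 -> 2Y
            counts["Y"] += 2
        else:
            counts["X2"] -= 1   # X2 + Y -> (nothing)
            counts["Y"] -= 1


if __name__ == "__main__":
    for x1, x2 in [(3, 2), (1, 5), (0, 4)]:
        print(x1, x2, run_max_crn(x1, x2))
```

Changing the seed changes the order in which reactions fire but not the final count of Y.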

Angluin, Aspnes and Eisenstat [2] investigated the computational behaviour of deterministic
CRNs under a different name, population protocols [1]. They showed that the input sets
$S \subseteq \mathbb{N}^k$ decidable by deterministic CRNs (i.e., providing "yes" or "no" answers by the presence
or absence of certain indicator species) are precisely the semilinear subsets of $\mathbb{N}^k$.¹ Chen, Doty,
and Soloveichik [5] extended these results to function computation and showed that precisely the
semilinear functions (functions $f$ whose graph $\{ (x, y) \in \mathbb{N}^{k+l} \mid f(x) = y \}$ is a semilinear set)
are deterministically computable by CRNs. We say a function $f : \mathbb{N}^k \to \mathbb{N}^l$ is stably (a.k.a.
deterministically) computable by a CRN $\mathcal{C}$ if there are "input" species $X_1, \ldots, X_k$ and "output"
species $Y_1, \ldots, Y_l$ such that, if $\mathcal{C}$ starts with $x_1, \ldots, x_k$ copies of $X_1, \ldots, X_k$ respectively, then
with probability one, it reaches a count-stable configuration in which the counts of $Y_1, \ldots, Y_l$ are
expressed by the vector $f(x_1, \ldots, x_k)$, and these counts never again change [5].

The method proposed in [5] uses some auxiliary "leader" species present initially, in addition to
the input species. To illustrate their utility, suppose that we want to compute the function $f(x) = x + 1$
with CRNs. Using the previous approach, we have an input species $X$ (with initial count $x$), an
output species $Y$ and an auxiliary "leader" species $L$ (with initial count 1). The following reactions
compute $f(x)$:

$X \to Y$
$L \to Y$

However, it is experimentally difficult to prepare a solution with a single copy (or a small
constant number) of a certain species. The authors of [5] asked whether it is possible to do away
with the initial "leader" molecules, i.e., to require that the initial configuration contains initial
count $x_1, x_2, \ldots, x_k$ of input species $X_1, X_2, \ldots, X_k$, and initial count 0 of every other species. It is
easy to "elect" a single leader molecule from an arbitrary initial number of copies using a reaction
such as $L + L \to L$, which eventually reduces the count of $L$ to 1. However, the problem with this
approach is that, since $L$ is a reactant in other reactions, there is no way in general to prevent $L$
from participating in these reactions until the reaction $L + L \to L$ has reduced it to a single copy.



¹Semilinear sets are defined formally in Section 2. Informally, they are finite unions of "periodic" sets, where the
definition of "periodic" is extended in a natural way to multi-dimensional spaces such as $\mathbb{N}^k$.



Despite these difficulties, we answer the question affirmatively, showing that each semilinear
function can be computed by a "leaderless" CRN, i.e., a CRN whose initial configuration contains
only the input species. To illustrate one idea used in our construction, consider the function
$f(x) = x + 1$ described above. In order to compute the function without a leader (i.e., the initial
configuration has $x$ copies of $X$ and 0 copies of every other species), the following reactions suffice:

$X \to B + 2Y$ (1.1)

$B + B \to B + K$ (1.2)

$Y + K \to \emptyset$ (1.3)

Reaction 1.1 produces $x$ copies of $B$ and $2x$ copies of $Y$. Reaction 1.2 consumes all copies of $B$
except one, so reaction 1.2 executes precisely $x - 1$ times, producing $x - 1$ copies of $K$. Therefore
reaction 1.3 consumes $x - 1$ copies of output species $Y$, eventually resulting in $2x - (x - 1) = x + 1$
copies of $Y$. Note that this approach uses a sort of leader election on the $B$ molecules.
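The counting argument for reactions (1.1)-(1.3) can be replayed in code. This is a minimal sketch under stated assumptions: the name `run_plus_one_crn` and the uniformly random choice among applicable reactions are illustrative, and the check is restricted to $x \ge 1$, the range covered by the counting argument.

```python
import random


def run_plus_one_crn(x, seed=0):
    """Simulate reactions (1.1)-(1.3) in a random order; return the final count of Y."""
    rng = random.Random(seed)
    c = {"X": x, "B": 0, "K": 0, "Y": 0}
    while True:
        enabled = []
        if c["X"] >= 1:
            enabled.append("1.1")
        if c["B"] >= 2:
            enabled.append("1.2")
        if c["Y"] >= 1 and c["K"] >= 1:
            enabled.append("1.3")
        if not enabled:
            return c["Y"]
        step = rng.choice(enabled)
        if step == "1.1":        # X -> B + 2Y
            c["X"] -= 1
            c["B"] += 1
            c["Y"] += 2
        elif step == "1.2":      # B + B -> B + K
            c["B"] -= 1
            c["K"] += 1
        else:                    # Y + K -> (nothing)
            c["Y"] -= 1
            c["K"] -= 1


if __name__ == "__main__":
    # With x = 0 no reaction is applicable and the sketch returns 0.
    print([run_plus_one_crn(x) for x in range(1, 6)])
```

For every $x \ge 1$ and every seed, the sketch ends with one copy of B, zero copies of K, and $x + 1$ copies of Y.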

In Section 3, we generalize this example, describing a leaderless CRN construction to compute 
any semilinear function. We use a similar framework to the construction of [5], decomposing the 
semilinear function into a finite union of affine partial functions (linear functions with an offset; 
defined formally in Section 2). We show how to compute each affine function with leaderless CRNs, 
using a fundamentally different construction than the affine-function-computing CRNs of [5]. This
result, Lemma 3.1, is the primary technical contribution of this paper. Next, in order to decide 
which affine function should be applied to a given input, we employ the leaderless semilinear 
predicate computation of Angluin, Aspnes, and Eisenstat [3]; this latter part of the construction is 
actually identical to the construction of [5], but we include it because our time analysis is different. 

Let $n = \|x\| = \|x\|_1 = \sum_{i=1}^{k} x(i)$ be the number of molecules present initially, as well as
the volume of the solution. The authors of [5] showed, for each semilinear function $f$, a direct
construction of a CRN that computes $f$ (using leaders) on input $x$ in expected time $O(n \log n)$.
They then combined this direct, error-free construction in parallel with a fast ($O(\log^5 n)$) but
error-prone CRN that uses a leader to compute any computable function (including semilinear), using
the error-free computation to change the answer of the error-prone computation only if the latter
is incorrect. This combination speeds up the computation from expected time $O(n \log n)$ for the
direct construction to expected time $O(\log^5 n)$ for the combined construction.

Since we assume no leaders may be supplied in the initial configuration, and since the problem 
of computing arbitrary computable functions without a leader remains a major open problem [3], 
this trick does not work for speeding up our construction. However, we show that with some care 
in the choice of reactions, the direct stable computation of a semilinear function can be done in 
expected time $O(n)$, improving upon the $O(n \log n)$ bound of the direct construction of [5].

2 Preliminaries 

Given a vector $x \in \mathbb{N}^k$, let $\|x\| = \|x\|_1 = \sum_{i=1}^{k} |x(i)|$, where $x(i)$ denotes the $i$th coordinate of $x$.
A set $A \subseteq \mathbb{N}^k$ is linear if there exist vectors $b, u_1, \ldots, u_p \in \mathbb{N}^k$ such that

$A = \{ b + n_1 u_1 + \ldots + n_p u_p \mid n_1, \ldots, n_p \in \mathbb{N} \}$.

$A$ is semilinear if it is a finite union of linear sets. If $f : \mathbb{N}^k \to \mathbb{N}^l$ is a function, define the graph of
$f$ to be the set $\{ (x, y) \in \mathbb{N}^k \times \mathbb{N}^l \mid f(x) = y \}$. A function is semilinear if its graph is semilinear.
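To make the definition of a linear set concrete, the following sketch tests membership by brute force. It is an illustration only: the function name `in_linear_set` and the bounded search are assumptions made here, chosen for clarity rather than efficiency.

```python
from itertools import product


def in_linear_set(point, base, periods):
    """Brute-force test of point = base + n_1*u_1 + ... + n_p*u_p with all n_i in N."""
    k = len(point)
    target = tuple(p - b for p, b in zip(point, base))
    if any(t < 0 for t in target):
        return False
    # Zero vectors contribute nothing, so they can be ignored.
    nonzero = [u for u in periods if any(u)]
    # Periods are nonnegative, so each coefficient is bounded by the target.
    bounds = [min(t // ui for t, ui in zip(target, u) if ui > 0) for u in nonzero]
    for coeffs in product(*(range(bound + 1) for bound in bounds)):
        total = tuple(sum(n * u[d] for n, u in zip(coeffs, nonzero)) for d in range(k))
        if total == target:
            return True
    return False


if __name__ == "__main__":
    # Graph of f(x) = 2x: base (0, 0) and a single period vector (1, 2).
    print(in_linear_set((3, 6), (0, 0), [(1, 2)]))
    print(in_linear_set((3, 5), (0, 0), [(1, 2)]))
```

A semilinear set is a finite union of such sets, so membership in it amounts to applying this test to each linear piece.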



We say a partial function $f : \mathbb{N}^k \dashrightarrow \mathbb{N}^l$ is affine if there exist $kl$ rational numbers $a_{1,1}, \ldots, a_{k,l} \in \mathbb{Q}$
and $l + k$ nonnegative integers $b_1, \ldots, b_l, c_1, \ldots, c_k \in \mathbb{N}$ such that, if $y = f(x)$, then for each
$j \in \{1, \ldots, l\}$, $y(j) = b_j + \sum_{i=1}^{k} a_{i,j}(x(i) - c_i)$, and for each $i \in \{1, \ldots, k\}$, $x(i) - c_i \ge 0$. In
matrix notation, there exist a $k \times l$ rational matrix $A$ and vectors $b \in \mathbb{N}^l$ and $c \in \mathbb{N}^k$ such that
$f(x) = A^{\top}(x - c) + b$.

This definition of affine function may appear contrived; see [5] for an explanation of its various 
intricacies. For reading this paper, the main utility of the definition is that it satisfies Lemma 3.2. 

Note that by appropriate integer arithmetic, a partial function $f : \mathbb{N}^k \dashrightarrow \mathbb{N}^l$ is affine if and only
if there exist $kl$ integers $n_{1,1}, \ldots, n_{k,l} \in \mathbb{Z}$, and $2l + k$ nonnegative integers $b_1, \ldots, b_l, c_1, \ldots, c_k, d_1, \ldots, d_l \in \mathbb{N}$
such that, if $y = f(x)$, then for each $j \in \{1, \ldots, l\}$, $y(j) = b_j + \frac{1}{d_j} \sum_{i=1}^{k} n_{i,j}(x(i) - c_i)$, and
for each $i \in \{1, \ldots, k\}$, $x(i) - c_i \ge 0$. Each $d_j$ may be taken to be the least common multiple of the
denominators of the rational coefficients in the original definition. We employ this latter definition,
since it is more convenient for working with integer-valued molecular counts.
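The passage from rational coefficients to the integer form can be sketched as follows. The helper names `to_integer_form` and `evaluate` are hypothetical, the sketch handles a single output coordinate $j$, and `math.lcm` with several arguments requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def to_integer_form(a_column):
    """Map rational coefficients a_{1,j}, ..., a_{k,j} to integers n_{i,j} and a common d_j."""
    d = lcm(*(a.denominator for a in a_column))  # lcm of the denominators (1 if empty)
    n = [int(a * d) for a in a_column]           # each a * d is an exact integer
    return n, d


def evaluate(n, d, b, c, x):
    """Compute y(j) = b_j + (1/d_j) * sum_i n_{i,j} * (x(i) - c_i) in integer arithmetic."""
    if any(x_i < c_i for x_i, c_i in zip(x, c)):
        raise ValueError("x is outside the domain: some x(i) - c_i is negative")
    total = sum(n_i * (x_i - c_i) for n_i, x_i, c_i in zip(n, x, c))
    if total % d != 0:
        raise ValueError("the sum is not divisible by d_j for this input")
    return b + total // d


if __name__ == "__main__":
    n, d = to_integer_form([Fraction(1, 2), Fraction(-2, 3)])
    print(n, d)
    print(evaluate(n, d, b=1, c=[0, 0], x=[4, 3]))
```

Working with $n_{i,j}$ and $d_j$ keeps every intermediate quantity an integer, which matches the integer-valued molecular counts that the construction manipulates.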

2.1 Chemical reaction networks 

If $\Lambda$ is a finite set (in this paper, of chemical species), we write $\mathbb{N}^\Lambda$ to denote the set of functions
$f : \Lambda \to \mathbb{N}$. Equivalently, we view an element $c \in \mathbb{N}^\Lambda$ as a vector of $|\Lambda|$ nonnegative integers, with
each coordinate "labeled" by an element of $\Lambda$. Given $X \in \Lambda$ and $c \in \mathbb{N}^\Lambda$, we refer to $c(X)$ as the
count of $X$ in $c$. We write $c \le c'$ to denote that $c(X) \le c'(X)$ for all $X \in \Lambda$. Given $c, c' \in \mathbb{N}^\Lambda$,
we define the vector component-wise operations of addition $c + c'$, subtraction $c - c'$, and scalar
multiplication $nc$ for $n \in \mathbb{N}$. If $\Delta \subset \Lambda$, we view a vector $c \in \mathbb{N}^\Delta$ equivalently as a vector $c \in \mathbb{N}^\Lambda$ by
assuming $c(X) = 0$ for all $X \in \Lambda \setminus \Delta$.

Given a finite set of chemical species $\Lambda$, a reaction over $\Lambda$ is a triple $\alpha = (r, p, k) \in \mathbb{N}^\Lambda \times \mathbb{N}^\Lambda \times \mathbb{R}^+$,
specifying the stoichiometry of the reactants and products, respectively, and the rate constant $k$. If
not specified, assume that $k = 1$ (this is the case for all reactions in this paper), so that the reaction
$\alpha = (r, p, 1)$ is also represented by the pair $(r, p)$. For instance, given $\Lambda = \{A, B, C\}$, the reaction
$A + 2B \to A + 3C$ is the pair $((1, 2, 0), (1, 0, 3))$. A (finite) chemical reaction network (CRN) is a
pair $\mathcal{C} = (\Lambda, R)$, where $\Lambda$ is a finite set of chemical species, and $R$ is a finite set of reactions over $\Lambda$.
A configuration of a CRN $\mathcal{C} = (\Lambda, R)$ is a vector $c \in \mathbb{N}^\Lambda$. We also write $\#_c X$ to denote $c(X)$, the
count of species $X$ in configuration $c$, or simply $\#X$ when $c$ is clear from context.

Given a configuration $c$ and reaction $\alpha = (r, p)$, we say that $\alpha$ is applicable to $c$ if $r \le c$ (i.e.,
$c$ contains enough of each of the reactants for the reaction to occur). If $\alpha$ is applicable to $c$, then
write $\alpha(c)$ to denote the configuration $c + p - r$ (i.e., the configuration that results from applying
reaction $\alpha$ to $c$). If $c' = \alpha(c)$ for some reaction $\alpha \in R$, we write $c \to_{\mathcal{C}} c'$, or merely $c \to c'$ when
$\mathcal{C}$ is clear from context. An execution (a.k.a., execution sequence) $\mathcal{E}$ is a finite or infinite sequence
of one or more configurations $\mathcal{E} = (c_0, c_1, c_2, \ldots)$ such that, for all $i \in \{1, \ldots, |\mathcal{E}| - 1\}$, $c_{i-1} \to c_i$.
If a finite execution sequence starts with $c$ and ends with $c'$, we write $c \to^*_{\mathcal{C}} c'$, or merely $c \to^* c'$
when the CRN $\mathcal{C}$ is clear from context. In this case, we say that $c'$ is reachable from $c$.
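The definitions of applicability, reaction application, and reachability translate directly into code. The sketch below represents configurations as tuples indexed by species, as in the example $A + 2B \to A + 3C$ above; the function names are illustrative assumptions, and the breadth-first search terminates only when the reachable set is finite.

```python
from collections import deque


def applicable(reaction, c):
    """A reaction (r, p) is applicable to c when r <= c holds component-wise."""
    r, _ = reaction
    return all(r_i <= c_i for r_i, c_i in zip(r, c))


def apply_reaction(reaction, c):
    """Return the configuration c + p - r obtained by applying the reaction to c."""
    r, p = reaction
    if not applicable(reaction, c):
        raise ValueError("reaction is not applicable to this configuration")
    return tuple(c_i + p_i - r_i for c_i, p_i, r_i in zip(c, p, r))


def reachable(reactions, start):
    """Enumerate every configuration reachable from start by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        c = queue.popleft()
        for reaction in reactions:
            if applicable(reaction, c):
                nxt = apply_reaction(reaction, c)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


if __name__ == "__main__":
    alpha = ((1, 2, 0), (1, 0, 3))  # A + 2B -> A + 3C over species (A, B, C)
    print(sorted(reachable([alpha], (1, 5, 0))))
```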

Turing machines, for example, have different semantic interpretations depending on the compu- 
tational task under study (deciding a language, computing a function, etc.). Similarly, in this paper 
we use CRNs to decide subsets of $\mathbb{N}^k$ (for which we reserve the term "chemical reaction decider"
or CRD) and to compute functions $f : \mathbb{N}^k \to \mathbb{N}^l$ (for which we reserve the term "chemical reaction
computer" or CRC). In the next two subsections we define two semantic interpretations of CRNs 
that correspond to these two tasks. We use the term CRN to refer to either a CRD or CRC when
the statement is applicable to either type.

These definitions differ slightly from those of [5], because ours are specialized to "leaderless" 
CRNs: those that can compute a predicate or function in which no species are present in the initial 
configuration other than the input species. In the terminology of [5], a CRN with species set $\Lambda$ and
input species set $\Sigma$ is leaderless if it has an initial context $\sigma : \Lambda \setminus \Sigma \to \mathbb{N}$ such that $\sigma(S) = 0$ for all
$S \in \Lambda \setminus \Sigma$. The definitions below are simplified by assuming this to be true of all CRNs.

We also use the convention of Angluin, Aspnes, and Eisenstat [2] that for a CRD, all species 
"vote" yes or no, rather than only a subset of species as in [5], since this convention is convenient 
for proving time bounds. 

2.2 Stable decidability of predicates 

We now review the definition of stable decidability of predicates introduced by Angluin, Aspnes, 
and Eisenstat [2].² Intuitively, the set of species is partitioned into two sets: those that "vote" yes
and those that vote no, and the system stabilizes to an output when a consensus vote is reached (all 
positive-count species have the same vote) that can no longer be changed (no species voting the other 
way can ever again be produced). It would be too strong to characterize deterministic correctness 
by requiring all possible executions to achieve the correct answer; for example, a reversible reaction 
such as $A \rightleftharpoons B$ could simply be chosen to run back and forth forever, starving any other reactions.
In the more refined definition that follows, the determinism of the system is captured in that it is 
impossible to stabilize to an incorrect answer, and the correct stable output is always reachable. 

A (leaderless) chemical reaction decider (CRD) is a tuple $\mathcal{D} = (\Lambda, R, \Sigma, \Upsilon)$, where $(\Lambda, R)$ is a
CRN, $\Sigma \subseteq \Lambda$ is the set of input species, and $\Upsilon \subseteq \Lambda$ is the set of yes voters, with species in $\Lambda \setminus \Upsilon$
referred to as no voters. An input to $\mathcal{D}$ will be an initial configuration $i \in \mathbb{N}^\Sigma$ (equivalently, $i \in \mathbb{N}^k$
if we write $\Sigma = \{X_1, \ldots, X_k\}$ and assign $X_i$ to represent the $i$th coordinate); that is, only input
species are allowed to be non-zero. If we are discussing a CRN understood from context to have a
certain initial configuration $i$, we write $\#_0 X$ to denote $i(X)$.

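We define a global output partial function $\Phi : \mathbb{N}^\Lambda \dashrightarrow \{0, 1\}$ as follows. $\Phi(c)$ is undefined if
either $c = 0$, or if there exist $S_0 \in \Lambda \setminus \Upsilon$ and $S_1 \in \Upsilon$ such that $c(S_0) > 0$ and $c(S_1) > 0$. Otherwise,
either $(\forall S \in \Lambda)(c(S) > 0 \implies S \in \Upsilon)$ or $(\forall S \in \Lambda)(c(S) > 0 \implies S \in \Lambda \setminus \Upsilon)$; in the former case,
the output $\Phi(c)$ of configuration $c$ is 1, and in the latter case, $\Phi(c) = 0$.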
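The global output function can be written out in a few lines. In this sketch a configuration is a dictionary from species names to counts and `None` stands for "undefined"; both representation choices, and the name `global_output`, are assumptions made for illustration.

```python
def global_output(c, yes_voters):
    """Return 1 or 0 when all present species agree on the vote, and None otherwise."""
    present = {species for species, count in c.items() if count > 0}
    if not present:
        return None                      # the all-zero configuration has no output
    if present <= yes_voters:
        return 1                         # every positive-count species votes yes
    if present.isdisjoint(yes_voters):
        return 0                         # every positive-count species votes no
    return None                          # mixed votes: no consensus yet


if __name__ == "__main__":
    print(global_output({"A": 2, "B": 0}, {"A"}))
    print(global_output({"A": 1, "B": 1}, {"A"}))
```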

A configuration $o$ is output stable if $\Phi(o)$ is defined and, for all $c$ such that $o \to^* c$, $\Phi(c) = \Phi(o)$.
We say a CRD $\mathcal{D}$ stably decides the predicate $\psi : \mathbb{N}^k \to \{0, 1\}$ if, for any initial configuration $i \in \mathbb{N}^\Sigma$,
for all configurations $c \in \mathbb{N}^\Lambda$, $i \to^* c$ implies $c \to^* o$ such that $o$ is output stable and $\Phi(o) = \psi(i)$.
Note that this condition implies that no incorrect output stable configuration is reachable from $i$.
We say that $\mathcal{D}$ stably decides a set $A \subseteq \mathbb{N}^k$ if it stably decides its indicator function.

The following theorem is due to Angluin, Aspnes, and Eisenstat [2]: 

Theorem 2.1 ([2]). A set $A \subseteq \mathbb{N}^k$ is stably decidable by a CRD if and only if it is semilinear.

The model they use is defined in a slightly different way; the differences (and those differences' 
lack of significance to the questions we explore) are explained in [5] . 



²Those authors use the term "stably compute", but we reserve the term "compute" to apply to the computation
of non-Boolean functions. Also, we omit discussion of the definition of stable computation used in the population
protocols literature, which employs a notion of "fair" executions; the definitions are proven equivalent in [5].



2.3 Stable computation of functions 

We now define a notion of stable computation of functions similar to those above for predicates. 
Intuitively, the inputs to the function are the initial counts of input species $X_1, \ldots, X_k$, and the
outputs are the counts of output species $Y_1, \ldots, Y_l$. The system stabilizes to an output when the
counts of the output species can no longer change. Again determinism is captured in that it is 
impossible to stabilize to an incorrect answer and the correct stable output is always reachable. 

A (leaderless) chemical reaction computer (CRC) is a tuple $\mathcal{C} = (\Lambda, R, \Sigma, \Gamma)$, where $(\Lambda, R)$ is a
CRN, $\Sigma \subseteq \Lambda$ is the set of input species, and $\Gamma \subseteq \Lambda$ is the set of output species, such that $\Sigma \cap \Gamma = \emptyset$.
By convention, we let $\Sigma = \{X_1, X_2, \ldots, X_k\}$ and $\Gamma = \{Y_1, Y_2, \ldots, Y_l\}$. We say that a configuration
$o$ is output stable if, for every $c$ such that $o \to^* c$ and every $Y_i \in \Gamma$, $o(Y_i) = c(Y_i)$ (i.e., the counts
of species in $\Gamma$ will never change if $o$ is reached). As with CRDs, we require initial configurations
$i \in \mathbb{N}^\Sigma$ in which only input species are allowed to be positive. We say that $\mathcal{C}$ stably computes a
function $f : \mathbb{N}^k \to \mathbb{N}^l$ if for any initial configuration $i \in \mathbb{N}^\Sigma$, $i \to^* c$ implies $c \to^* o$ such that $o$ is an
output stable configuration with $f(i) = (o(Y_1), o(Y_2), \ldots, o(Y_l))$. Note that this condition implies
that no incorrect output stable configuration is reachable from $i$.

If a CRN stably decides a predicate or stably computes a function, we say the CRN is stable 
(a.k.a. deterministic). 

2.4 Kinetic model 

The following model of stochastic chemical kinetics is widely used in quantitative biology and 
other fields dealing with chemical reactions between species present in small counts [9]. It ascribes 
probabilities to execution sequences, and also defines the time of reactions, allowing us to study 
the computational complexity of the CRN computation in Section 3. 

In this paper, the rate constants of all reactions are 1, and we define the kinetic model with 
this assumption. The rate constants do not affect the definition of stable computation; they only 
affect the time analysis. Our time analyses remain asymptotically unaffected if the rate constants 
are changed (although the constants hidden in the big-O notation would change). A reaction is 
unimolecular if it has one reactant and bimolecular if it has two reactants. We use no higher-order 
reactions in this paper. 

The kinetics of a CRN is described by a continuous-time Markov process as follows. Given 
a fixed volume $v \in \mathbb{R}^+$ and current configuration $c$, the propensity of a unimolecular reaction
$\alpha : X \to \ldots$ in configuration $c$ is $\rho(c, \alpha) = \#_c X$. The propensity of a bimolecular reaction
$\alpha : X + Y \to \ldots$, where $X \neq Y$, is $\rho(c, \alpha) = \frac{\#_c X \, \#_c Y}{v}$. The propensity of a bimolecular reaction
$\alpha : X + X \to \ldots$ is $\rho(c, \alpha) = \frac{1}{2} \frac{\#_c X (\#_c X - 1)}{v}$. The propensity function determines the evolution of
the system as follows. The time until the next reaction occurs is an exponential random variable
with rate $\rho(c) = \sum_{\alpha \in R} \rho(c, \alpha)$ (note that $\rho(c) = 0$ if no reactions are applicable to $c$).
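The propensities above can be turned into a small stochastic simulation. This is a minimal sketch under stated assumptions: all rate constants are 1, reactions are given as pairs of species-name tuples, and the next reaction is chosen with probability proportional to its propensity, which is the standard Gillespie rule and is assumed here rather than quoted from the text. The names `propensity` and `simulate` are illustrative.

```python
import random


def propensity(reactants, counts, volume):
    """Propensity of a unit-rate unimolecular or bimolecular reaction."""
    if len(reactants) == 1:
        return float(counts.get(reactants[0], 0))
    first, second = reactants
    if first != second:
        return counts.get(first, 0) * counts.get(second, 0) / volume
    n = counts.get(first, 0)
    return n * (n - 1) / (2 * volume)


def simulate(reactions, counts, volume, seed=0):
    """Run until no reaction has positive propensity; return (elapsed_time, counts)."""
    rng = random.Random(seed)
    counts = dict(counts)
    elapsed = 0.0
    while True:
        rates = [propensity(reactants, counts, volume) for reactants, _ in reactions]
        total = sum(rates)
        if total == 0:
            return elapsed, counts
        elapsed += rng.expovariate(total)   # exponential waiting time with rate rho(c)
        reactants, products = rng.choices(reactions, weights=rates)[0]
        for species in reactants:
            counts[species] -= 1
        for species in products:
            counts[species] = counts.get(species, 0) + 1


if __name__ == "__main__":
    # Leader election L + L -> L with 50 molecules in volume 50.
    election = [(("L", "L"), ("L",))]
    print(simulate(election, {"L": 50}, volume=50.0))
```

Running the leader-election example at several sizes gives an empirical feel for how the completion time of a reaction of the form $L + L \to L$ grows with the number of molecules.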

The kinetic model is based on the physical assumption of well-mixedness valid in a dilute
solution. Thus, we assume the finite density constraint, which stipulates that a volume required to
execute a CRN must be proportional to the maximum molecular count obtained during
execution [14]. In other words, the total concentration (molecular count per volume) is bounded. This
realistically constrains the speed of the computation achievable by CRNs. Note, however, that it
is problematic to define the kinetic model for CRNs in which the reachable configuration space
is unbounded for some start configurations, because this means that arbitrarily large molecular
counts are reachable.³ We apply the kinetic model only to CRNs with configuration spaces that
are bounded for each start configuration, choosing the volume to be equal to the reachable
configuration with the highest molecular count (in this paper, this will always be within a constant
multiplicative factor of the number of input molecules).

It is not difficult to show that if a CRN is stable and has a finite reachable configuration 
space from any initial configuration i, then under the kinetic model (in fact, for any choice of rate 
constants), with probability 1 the CRN will eventually reach an output stable configuration. 

We require the following lemmas, which are proven in Appendix A. 

Lemma 2.2. Let $A = \{A_1, \ldots, A_m\}$ be a set of species with the property that they appear only
in applicable reactions of the form $A_i \to \sum_{l} B_l$, where $B_l \notin A$. Then starting from a configuration
$c$ in which for all $i \in \{1, \ldots, m\}$, $\#_c A_i = O(n)$, with volume $O(n)$, the expected time to reach a
configuration in which none of the described reactions can occur is $O(\log n)$.

Lemma 2.3. Let $A = \{A_1, \ldots, A_m\}$ be a set of species with the property that they appear only in
applicable reactions of the form $A_i + A_j \to A_k + \sum_{l} B_l$, where $B_l \notin A$, and for all $i, j \in \{1, \ldots, m\}$,
there is at least one reaction $A_i + A_j \to \ldots$. Then starting from a configuration $c$ in which for all
$i \in \{1, \ldots, m\}$, $\#_c A_i = O(n)$, with volume $O(n)$, the expected time to reach a configuration in
which none of the described reactions can occur is $O(n)$.

Lemma 2.4. Let $C = \{C_1, \ldots, C_m\}$ and $A = \{A_1, \ldots, A_p\}$ be two sets of species with the property
that they appear only in applicable reactions of the form $C_i + A_j \to C_i + \sum_{l} B_l$, where $B_l \notin A$. Then
starting from a configuration $c$ in which for all $i \in \{1, \ldots, m\}$, $\#_c A_i = O(n)$ and $\#_c C_i = \Omega(n)$, with
volume $O(n)$, the expected time to reach a configuration in which none of the described reactions
can occur is $O(\log n)$.

3 Leaderless CRCs can compute semilinear functions 

To supply an input vector $x \in \mathbb{N}^k$ to a CRN, we use an initial configuration with $x(i)$ molecules of
input species $X_i$. Throughout this section, we let $n = \|x\|_1 = \sum_{i=1}^{k} x(i)$ denote the initial number
of molecules in solution. Since all CRNs we employ have the property that they produce at most
a constant multiplicative factor more molecules than are initially present, this implies that the
volume required to satisfy the finite density constraint is $O(n)$.

Suppose the CRC $\mathcal{C}$ stably computes a function $f : \mathbb{N}^k \to \mathbb{N}^l$. We say that $\mathcal{C}$ stably computes
$f$ monotonically if its output species are not consumed in any reaction.⁴

We show in Lemma 3.1 that affine partial functions can be computed in expected time $O(n)$
by a leaderless CRC. For its use in proving Theorem 3.4, we require that the output molecules
be produced monotonically. This is impossible for general affine partial functions. For example,
consider the function $f(x_1, x_2) = x_1 - x_2$ where $\mathrm{dom}\, f = \{ (x_1, x_2) \mid x_1 \ge x_2 \}$. By withholding a
single copy of $X_2$ and letting the CRC stabilize to the output value $\#Y = x_1 - x_2 + 1$, then allowing
the extra copy of $X_2$ to interact, the only way to stabilize to the correct output value $x_1 - x_2$ is to
consume a copy of the output species $Y$. Therefore Lemma 3.1 is stated in terms of an encoding
of affine partial functions that allows monotonic production of outputs, encoding the output value


³One possibility is to have a "dynamically" growing volume as in [14].

⁴Its output species could potentially be reactants so long as they are catalytic, meaning that the stoichiometry of
the species as a product is at least as great as its stoichiometry as a reactant, e.g. $X + Y \to Z + Y$ or $A + Y \to Y + Y$.



$y(j)$ as the difference between the counts of two monotonically produced species $Y_j^P$ and $Y_j^C$, a
concept formalized by the following definition.

Let $f : \mathbb{N}^k \dashrightarrow \mathbb{N}^l$ be a partial function. We say that a partial function $\hat{f} : \mathbb{N}^k \dashrightarrow \mathbb{N}^l \times \mathbb{N}^l$ is
a diff-representation of $f$ if $\mathrm{dom}\, f = \mathrm{dom}\, \hat{f}$ and, for all $x \in \mathrm{dom}\, f$, if $(y_P, y_C) = \hat{f}(x)$, where
$y_P, y_C \in \mathbb{N}^l$, then $f(x) = y_P - y_C$, and $y_P = O(f(x))$. In other words, $\hat{f}$ represents $f$ as the
difference of its two outputs $y_P$ and $y_C$, with the larger output $y_P$ possibly being larger than the
original function's output, but at most a multiplicative constant larger.

The following lemma is the main technical result required for proving our main theorem,
Theorem 3.4. It shows that every affine function can be computed (via a diff-representation) in time
$O(n)$ by a leaderless CRC.

Lemma 3.1. Let $f : \mathbb{N}^k \dashrightarrow \mathbb{N}^l$ be an affine partial function. Then there is a diff-representation
$\hat{f} : \mathbb{N}^k \dashrightarrow \mathbb{N}^l \times \mathbb{N}^l$ of $f$ and a leaderless CRC that monotonically stably computes $\hat{f}$ in expected
time $O(n)$.

Proof. If $f$ is affine, then there exist $kl$ integers $n_{1,1}, \ldots, n_{k,l} \in \mathbb{Z}$ and $2l + k$ nonnegative integers
$b_1, \ldots, b_l, c_1, \ldots, c_k, d_1, \ldots, d_l \in \mathbb{N}$ such that, if $y = f(x)$, then for each $j \in \{1, \ldots, l\}$,
$y(j) = b_j + \frac{1}{d_j} \sum_{i=1}^{k} n_{i,j}(x(i) - c_i)$, and for each $i \in \{1, \ldots, k\}$, $x(i) - c_i \ge 0$. Define the CRC as follows.
It has input species $\Sigma = \{X_1, \ldots, X_k\}$ and output species $\Gamma = \{Y_1^P, \ldots, Y_l^P, Y_1^C, \ldots, Y_l^C\}$.

There are three main components of the CRN, separately handling the $c_i$ offset, the $n_{i,j}/d_j$
coefficient, and the $b_j$ offset.

The latter two components both make use of $Y_j^C$ molecules to account for production of $Y_j^P$
molecules in excess of $y(j)$ to ensure that $\#_\infty Y_j^P - \#_\infty Y_j^C = y(j)$, which establishes that the
CRC stably computes a diff-representation of $f$. It is clear by inspection of the reactions that
$\#_\infty Y_j^P = O(y(j))$.

Add the reaction 

$X_1 \to C_{1,1} + B_1 + B_2 + \ldots + B_l + b_1 Y_1^P + b_2 Y_2^P + \ldots + b_l Y_l^P$ (3.1)

The first product $C_{1,1}$ will be used to handle the $c_1$ offset, and the remaining products will be used
to handle the $b_j$ offsets. For each $i \in \{2, \ldots, k\}$, add the reaction

$X_i \to C_{i,1}$ (3.2)

By Lemma 2.2, reactions (3.1) and (3.2) take time $O(\log n)$ to complete.
We now describe the three components of the CRC separately. 

$c_i$ offset: Reactions (3.1) and (3.2) produce $x(i)$ copies of $C_{i,1}$. We must reduce this number by $c_i$,
producing $x(i) - c_i$ copies of $X'_i$, the species that will be used by the next component to handle
the $n_{i,j}/d_j$ coefficient. A high-order reaction implementing this is $(c_i + 1) C_{i,1} \to c_i C_{i,1} + X'_i$,
since that reaction will eventually happen exactly $x(i) - c_i$ times (stopping when $\#C_{i,1}$ reaches
$c_i$). This is implemented by the following bimolecular reactions.

For each $i \in \{1, \ldots, k\}$ and $m, p \in \{1, \ldots, c_i\}$, if $m + p \le c_i$, add the reaction

$C_{i,m} + C_{i,p} \to C_{i,m+p}$.

If $m + p > c_i$, add the reaction

$C_{i,m} + C_{i,p} \to C_{i,c_i} + (m + p - c_i) X'_i$.

By Lemma 2.3, these reactions complete in expected time $O(n)$.
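The offset component can be sanity-checked by simulation. The sketch below tracks only the second subscript $m$ of each molecule $C_{i,m}$. It assumes that two molecules with $m + p \le c_i$ merge into a single $C_{i,m+p}$, and that two molecules with $m + p > c_i$ produce $C_{i,c_i}$ plus $m + p - c_i$ copies of $X'_i$. The function name, the random pairing, and the restriction to $c_i \ge 1$ and $x(i) \ge c_i$ are assumptions of the sketch.

```python
import random


def offset_component(x, c, seed=0):
    """Start with x copies of C_{i,1}; return the number of X'_i produced (needs c >= 1)."""
    rng = random.Random(seed)
    weights = [1] * x            # each entry m stands for one molecule C_{i,m}
    x_prime = 0
    while len(weights) >= 2:     # any two remaining molecules can react
        a, b = rng.sample(range(len(weights)), 2)
        m, p = weights[a], weights[b]
        for index in sorted((a, b), reverse=True):
            weights.pop(index)
        if m + p <= c:
            weights.append(m + p)      # C_{i,m} + C_{i,p} -> C_{i,m+p}
        else:
            weights.append(c)          # C_{i,m} + C_{i,p} -> C_{i,c} + (m+p-c) X'_i
            x_prime += m + p - c
    return x_prime


if __name__ == "__main__":
    print(offset_component(x=10, c=3))
```

The quantity (sum of subscripts) plus (number of $X'_i$) stays equal to $x(i)$ throughout, which is why the final count of $X'_i$ is $x(i) - c_i$ whenever $x(i) \ge c_i$.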



$n_{i,j}/d_j$ coefficient: For each $i \in \{1, \ldots, k\}$, add the reaction

$X'_i \to X_{i,1} + X_{i,2} + \ldots + X_{i,l}$

This allows each output to be associated with its own copy of the input. By Lemma 2.2,
these reactions complete in expected time $O(\log n)$.

For each $i \in \{1, \ldots, k\}$ and $j \in \{1, \ldots, l\}$, add the reaction

$X_{i,j} \to n_{i,j} D^P_{j,1}$ if $n_{i,j} > 0$, and $X_{i,j} \to (-n_{i,j}) D^C_{j,1}$ if $n_{i,j} < 0$.

By Lemma 2.2, these reactions complete in expected time $O(\log n)$.

We must now divide $\#D^P_{j,1}$ and $\#D^C_{j,1}$ by $d_j$. This is accomplished by the high-order reactions
$d_j D^P_{j,1} \to Y_j^P$ and $d_j D^C_{j,1} \to Y_j^C$. Similarly to the previous component, we implement these
with the following reactions for $d_j \ge 1$.

We first handle the case $d_j > 1$. For each $j \in \{1, \ldots, l\}$ and $m, p \in \{1, \ldots, d_j - 1\}$, if
$m + p \le d_j - 1$, add the reactions

$D^P_{j,m} + D^P_{j,p} \to D^P_{j,m+p}$
$D^C_{j,m} + D^C_{j,p} \to D^C_{j,m+p}$

If $m + p \ge d_j$, add the reactions

$D^P_{j,m} + D^P_{j,p} \to D^P_{j,m+p-d_j} + Y_j^P$
$D^C_{j,m} + D^C_{j,p} \to D^C_{j,m+p-d_j} + Y_j^C$

By Lemma 2.3, these reactions complete in expected time $O(n)$.
When $d_j = 1$, we only have the following unimolecular reactions.

$D^P_{j,1} \to Y_j^P$
$D^C_{j,1} \to Y_j^C$

By Lemma 2.2, these reactions complete in expected time $O(\log n)$.

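These reactions will produce $\frac{1}{d_j} \sum_{n_{i,j} > 0} n_{i,j}(x(i) - c_i)$ copies of $Y_j^P$ and $-\frac{1}{d_j} \sum_{n_{i,j} < 0} n_{i,j}(x(i) - c_i)$
copies of $Y_j^C$. Therefore, letting $\#_{\mathrm{coef}} Y_j^P$ and $\#_{\mathrm{coef}} Y_j^C$ denote the number of copies of
$Y_j^P$ and $Y_j^C$ eventually produced just by this component, it holds that
$\#_{\mathrm{coef}} Y_j^P - \#_{\mathrm{coef}} Y_j^C = \frac{1}{d_j} \sum_{i=1}^{k} n_{i,j}(x(i) - c_i)$.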
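The coefficient component for a single output $j$ can be sketched in the same style as the offset component. The sketch counts the positive and negative contributions separately and then divides each by $d_j$ through pairwise merging. It assumes that a product with residue $m + p - d_j = 0$ leaves no molecule behind; that convention, like the function names, is an assumption of this sketch. Each division rounds down, so the difference of the two results equals $\frac{1}{d_j} \sum_i n_{i,j}(x(i) - c_i)$ when that quantity is an integer.

```python
import random


def divide_by_merging(count, d, rng):
    """Start with `count` copies of D_{j,1}; return how many Y_j are released."""
    if d == 1:
        return count                       # unimolecular case D_{j,1} -> Y_j
    residues = [1] * count                 # entry m stands for one molecule D_{j,m}
    released = 0
    while len(residues) >= 2:
        a, b = rng.sample(range(len(residues)), 2)
        total = residues[a] + residues[b]
        for index in sorted((a, b), reverse=True):
            residues.pop(index)
        if total >= d:                     # D_{j,m} + D_{j,p} -> D_{j,m+p-d} + Y_j
            total -= d
            released += 1
        if total > 0:                      # assumption: residue 0 leaves no molecule
            residues.append(total)
    return released


def coefficient_component(n_column, d, shifted_input, seed=0):
    """Return (#Y_j^P, #Y_j^C) for coefficients n_{i,j} and inputs x(i) - c_i."""
    rng = random.Random(seed)
    positive = sum(n * x for n, x in zip(n_column, shifted_input) if n > 0)
    negative = sum(-n * x for n, x in zip(n_column, shifted_input) if n < 0)
    return divide_by_merging(positive, d, rng), divide_by_merging(negative, d, rng)


if __name__ == "__main__":
    y_p, y_c = coefficient_component([1, -1], 2, [5, 1])
    print(y_p, y_c, y_p - y_c)
```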

$b_j$ offset: For each $j \in \{1, \ldots, l\}$, add the reaction

$B_j + B_j \to B_j + b_j Y_j^C$ (3.3)

By Lemma 2.3, these reactions complete in expected time $O(n)$.

Reaction (3.1) produces $b_j$ copies of $Y_j^P$ for each copy of $B_j$ produced, which is $x(1)$.
Reaction (3.3) occurs precisely $x(1) - 1$ times. Therefore reaction (3.3) produces precisely $b_j$ fewer
copies of $Y_j^C$ than reaction (3.1) produces of $Y_j^P$. This implies that when all copies of $Y_j^C$ are
eventually produced by reaction (3.3), the number of $Y_j^P$'s produced by reaction (3.1) minus
the number of $Y_j^C$'s produced by reaction (3.3) is $b_j$. □



We require the following lemma, proven in [5]. 

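Lemma 3.2 ([5]). Let $f : \mathbb{N}^k \to \mathbb{N}^l$ be a semilinear function. Then there is a finite set
$\{f_1 : \mathbb{N}^k \dashrightarrow \mathbb{N}^l, \ldots, f_m : \mathbb{N}^k \dashrightarrow \mathbb{N}^l\}$ of affine partial functions, where each $\mathrm{dom}\, f_i$ is a linear set, such that, for
each $x \in \mathbb{N}^k$, if $f_i(x)$ is defined, then $f(x) = f_i(x)$, and $\bigcup_{i=1}^{m} \mathrm{dom}\, f_i = \mathbb{N}^k$.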

We require the following theorem, due to Angluin, Aspnes, and Eisenstat [3, Theorem 5], which 
states that any semilinear predicate can be decided by a CRD in expected time $O(n)$.

Theorem 3.3 ([3]). Let $\phi : \mathbb{N}^k \to \{0, 1\}$ be a semilinear predicate. Then there is a leaderless CRD
$\mathcal{D}$ that stably decides $\phi$, and the expected time to reach an output-stable configuration is $O(n)$.

The following is the main theorem of this paper. It shows that semilinear functions can be 
computed by leaderless CRCs in linear expected time. 

Theorem 3.4. Let $f : \mathbb{N}^k \to \mathbb{N}^l$ be a semilinear function. Then there is a leaderless CRC that
stably computes $f$ in expected time $O(n)$.

Proof. The CRC will have input species $\Sigma = \{X_1, \ldots, X_k\}$ and output species $\Gamma = \{Y_1, \ldots, Y_l\}$.
By Lemma 3.2, there is a finite set $F = \{f_1 : \mathbb{N}^k \dashrightarrow \mathbb{N}^l, \ldots, f_m : \mathbb{N}^k \dashrightarrow \mathbb{N}^l\}$ of affine partial
functions, where each $\mathrm{dom}\, f_i$ is a linear set, such that, for each $\mathbf{x} \in \mathbb{N}^k$, if $f_i(\mathbf{x})$ is defined, then
$f(\mathbf{x}) = f_i(\mathbf{x})$. We compute $f$ on input $\mathbf{x}$ as follows. Since each $\mathrm{dom}\, f_i$ is a linear (and therefore
semilinear) set, by Theorem 3.3 we compute each semilinear predicate $\phi_i$ = "$\mathbf{x} \in \mathrm{dom}\, f_i$ and
$(\forall i' \in \{1, \ldots, i-1\})\ \mathbf{x} \notin \mathrm{dom}\, f_{i'}$?" by separate parallel CRDs, each stabilizing in expected time
$O(n)$. (The latter condition ensures that for each $\mathbf{x}$, precisely one of the predicates is true, in case
the domains of the partial functions have nonempty intersection.)
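To illustrate how the predicates $\phi_i$ pick exactly one affine piece even when domains overlap, here is a minimal Python sketch. The example function $f(x_1, x_2) = \max(x_1, x_2)$, its two-piece decomposition, and the names `pieces`, `phi`, and `evaluate` are assumptions chosen for illustration; they do not appear in the construction itself.

```python
from typing import Callable, List, Tuple

Vec = Tuple[int, int]

# Hypothetical decomposition of f(x1, x2) = max(x1, x2) into two affine pieces.
# The two domains overlap on the diagonal x1 == x2.
pieces: List[Tuple[Callable[[Vec], bool], Callable[[Vec], int]]] = [
    (lambda x: x[0] >= x[1], lambda x: x[0]),  # dom f_1: x1 >= x2, f_1(x) = x1
    (lambda x: x[0] <= x[1], lambda x: x[1]),  # dom f_2: x1 <= x2, f_2(x) = x2
]


def phi(i: int, x: Vec) -> bool:
    """phi_i(x): x lies in dom f_i and in no earlier domain dom f_i' with i' < i."""
    in_dom = pieces[i][0](x)
    in_earlier = any(pieces[h][0](x) for h in range(i))
    return in_dom and not in_earlier


def evaluate(x: Vec) -> int:
    """Evaluate f through the unique piece whose predicate holds."""
    winners = [i for i in range(len(pieces)) if phi(i, x)]
    assert len(winners) == 1, "exactly one predicate should be true"
    return pieces[winners[0]][1](x)


if __name__ == "__main__":
    print(evaluate((3, 3)), evaluate((2, 7)), evaluate((9, 1)))
```

On the diagonal both domains contain the input, but only the first predicate is true, which is the role of the "no earlier domain" clause.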

By Lemma 3.1, for each $i \in \{1, \ldots, m\}$, there is a diff-representation $\hat{f}_i$ of $f_i$ that can be stably
computed by parallel CRCs. Assume that for each $i \in \{1, \ldots, m\}$ and each $j \in \{1, \ldots, l\}$, the
$j$th pair of outputs $\mathbf{y}_P(j)$ and $\mathbf{y}_C(j)$ of the $i$th function is represented by species $\hat{Y}^P_{i,j}$ and $\hat{Y}^C_{i,j}$. We
interpret each $\hat{Y}^P_{i,j}$ and $\hat{Y}^C_{i,j}$ as an "inactive" version of "active" output species $Y^P_{i,j}$ and $Y^C_{i,j}$.

For each $i \in \{1, \ldots, m\}$, for the CRD $\mathcal{D}_i = (\Lambda, R, \Sigma, \Upsilon)$ computing the predicate $\phi_i$, let $L^1_i$
represent any species in $\Upsilon$, and $L^0_i$ represent any species in $\Lambda \setminus \Upsilon$, and assume that, once $\mathcal{D}_i$ reaches an
output-stable configuration, $\#L^b_i = \Omega(n)$, where $b$ is the output of $\mathcal{D}_i$. Then add the following
reactions for each $i \in \{1, \ldots, m\}$ and each $j \in \{1, \ldots, l\}$:

$L^1_i + \hat{Y}^P_{i,j} \to L^1_i + Y^P_{i,j} + Y_j$ (3.4)

$L^0_i + Y^P_{i,j} \to L^0_i + M_{i,j}$ (3.5)

$M_{i,j} + Y_j \to \hat{Y}^P_{i,j}$ (3.6)

The latter two reactions implement the reverse direction of the first reaction (using $L^0_i$ as a catalyst
instead of $L^1_i$) using only bimolecular reactions. Also add the reactions

$L^1_i + \hat{Y}^C_{i,j} \to L^1_i + Y^C_{i,j}$ (3.7)

$L^0_i + Y^C_{i,j} \to L^0_i + \hat{Y}^C_{i,j}$ (3.8)



and

$Y^P_{i,j} + Y^C_{i,j} \to K_j$ (3.9)

$K_j + Y_j \to \emptyset$ (3.10)

That is, a "yes" answer for function $i$ activates the $i$th output and a "no" answer deactivates
the $i$th output. Eventually each CRD stabilizes so that precisely one $i$ has $L^1_i$ present, and for all
$i' \neq i$, $L^0_{i'}$ is present. We now claim that at this point, all outputs for the correct function $f_i$ will
be activated and all other outputs will be deactivated. The reactions enforce that at any time,
$\#Y_j = \#K_j + \sum_{i=1}^{m} (\#Y^P_{i,j} + \#M_{i,j})$. In particular, $\#Y_j \geq \#K_j$ and $\#Y_j \geq \#M_{i,j}$ at all times,
so there will never be a $K_j$ or $M_{i,j}$ molecule that cannot participate in the reaction of which it is
a reactant. Eventually $\#Y^P_{i,j}$ and $\#Y^C_{i,j}$ stabilize to 0 for all but one value of $i$ (by reactions (3.5),
(3.6), (3.8)), and for this value of $i$, $\#Y^P_{i,j}$ stabilizes to $\mathbf{y}(j)$ and $\#Y^C_{i,j}$ stabilizes to 0 (by reaction
(3.9)). Eventually $\#K_j$ stabilizes to 0 by the last reaction. Eventually $\#M_{i,j}$ stabilizes to 0 since
$L^0_i$ is absent for the correct function $f_i$. This ensures that $\#Y_j$ stabilizes to $\mathbf{y}(j)$.
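The activation and deactivation mechanism of reactions (3.4) through (3.10) can be exercised with a small randomized simulation. The sketch below is illustrative only and rests on several assumptions: it tracks a single output index $j$, it replaces the CRDs by a list `votes` that says whether $L^1_i$ or $L^0_i$ is present, it picks applicable reactions uniformly at random rather than by propensity, and the counts 5, 2, 4, 0 are made-up values of $\mathbf{y}_P(j)$ and $\mathbf{y}_C(j)$ for two hypothetical functions. It checks the invariant $\#Y_j = \#K_j + \sum_i (\#Y^P_{i,j} + \#M_{i,j})$ after every step.

```python
import random


def applicable(s, votes):
    """Return the (reaction, i) pairs that can fire in state s.

    votes[i] == 1 means L^1_i is present; votes[i] == 0 means L^0_i is present.
    """
    out = []
    for i, vote in enumerate(votes):
        if vote == 1 and s["hatP"][i] > 0:
            out.append(("3.4", i))
        if vote == 1 and s["hatC"][i] > 0:
            out.append(("3.7", i))
        if vote == 0 and s["P"][i] > 0:
            out.append(("3.5", i))
        if vote == 0 and s["C"][i] > 0:
            out.append(("3.8", i))
        if s["M"][i] > 0 and s["Y"] > 0:
            out.append(("3.6", i))
        if s["P"][i] > 0 and s["C"][i] > 0:
            out.append(("3.9", i))
    if s["K"] > 0 and s["Y"] > 0:
        out.append(("3.10", 0))
    return out


# Net change of each reaction: (per-function species, shared species).
EFFECTS = {
    "3.4": ({"hatP": -1, "P": +1}, {"Y": +1}),
    "3.5": ({"P": -1, "M": +1}, {}),
    "3.6": ({"M": -1, "hatP": +1}, {"Y": -1}),
    "3.7": ({"hatC": -1, "C": +1}, {}),
    "3.8": ({"C": -1, "hatC": +1}, {}),
    "3.9": ({"P": -1, "C": -1}, {"K": +1}),
    "3.10": ({}, {"K": -1, "Y": -1}),
}


def fire(s, rxn, i):
    per_function, shared = EFFECTS[rxn]
    for species, delta in per_function.items():
        s[species][i] += delta
    for species, delta in shared.items():
        s[species] += delta


def invariant(s):
    return s["Y"] == s["K"] + sum(s["P"]) + sum(s["M"])


def run(s, votes, rng, max_steps=None):
    """Fire random applicable reactions until none remain or max_steps is hit."""
    steps = 0
    while max_steps is None or steps < max_steps:
        options = applicable(s, votes)
        if not options:
            break
        fire(s, *rng.choice(options))
        assert invariant(s)
        steps += 1


def simulate(seed):
    rng = random.Random(seed)
    # Two hypothetical functions for one output j: (y_P, y_C) = (5, 2) and (4, 0).
    s = {"hatP": [5, 4], "hatC": [2, 0], "P": [0, 0], "C": [0, 0],
         "M": [0, 0], "Y": 0, "K": 0}
    run(s, votes=[0, 1], rng=rng, max_steps=6)  # votes are temporarily wrong
    run(s, votes=[1, 0], rng=rng)               # votes have stabilized: function 0 wins
    return s


if __name__ == "__main__":
    print(simulate(0))
```

In this sketch the wrong function's outputs are first activated and then withdrawn through $M_{i,j}$, and the run ends with $\#Y_j = 5 - 2 = 3$ regardless of the random order of reactions.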

It remains to analyze the expected time to stabilization. Let $n = \|\mathbf{x}\|$. By Lemma 3.1, the
expected time for each affine function computation to complete is $O(n)$. Since the $\hat{Y}^P_{i,j}$ are produced
monotonically, the most $Y^P_{i,j}$ molecules that are ever produced is $\#_\infty \hat{Y}^P_{i,j}$. Since we have $m$
computations in parallel, the expected time for all of them to complete is $O(nm) = O(n)$ (since $m$
depends on $f$ but not $n$). We must also wait for each predicate computation to complete. By Theorem 3.3,
each of these predicates takes expected time $O(n)$ to complete, so all of them complete
in expected time $O(mn) = O(n)$.

At this point, the $L^1_i$ leaders must convert inactive output species to active, and $L^0_{i'}$ (for $i' \neq i$)
must convert active output species to inactive. By Lemma 2.4, reactions (3.4), (3.5), (3.7), and (3.8)
complete in expected time $O(\log n)$. Once this is completed, by Lemma 2.3, reaction (3.6) completes
in expected time $O(n)$. Reaction (3.9) completes in expected time $O(n)$ by Lemma 2.3. Once this
is completed, reaction (3.10) completes in expected time $O(n)$ by Lemma 2.3. □

4 Conclusion 

The clearest shortcoming of our leaderless CRC, compared to the leader-employing CRC of [5],
is the time complexity. Our CRC takes expected time $O(n)$ to complete with $n$ input molecules,
versus $O(\log^5 n)$ for the CRC of [5]. The major open question is, for each semilinear function
$f : \mathbb{N}^k \to \mathbb{N}^l$, is there a leaderless CRC that stably computes $f$ on input of size $n$ in expected
time $t(n)$, where $t$ is a sublinear function? This may relate to the question of whether there is a
sublinear time CRN that solves the leader election problem, i.e., in volume $n$ with an initial state
with $n$ copies of species $X$ and no other species initially present, produce a single copy of a species
$L$. However, it is conceivable that there is a direct way to compute semilinear functions quickly
without needing to use a leader election.

If this is not possible for all semilinear functions, another interesting open question is to precisely
characterize the class of functions that can be stably computed by a leaderless CRC in polylogarithmic
time. For example, the class of linear functions with positive integer coefficients (e.g.,
$f(x_1, x_2) = 3x_1 + 2x_2$) has this property since they are computable by $O(\log n)$-time unimolecular
reactions such as $X_1 \to 3Y$, $X_2 \to 2Y$. However, most of the CRN programming techniques used to
generalize beyond such functions seem to require some bimolecular reaction $A + B \to \ldots$ in which
it is possible to have $\#A = \#B = 1$, making the expected time at least $n$ just for this reaction.
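The contrast drawn above can be made concrete with a minimal Python sketch. It assumes unit rate constants and the usual stochastic propensity $\#A \cdot \#B / v$ for a bimolecular reaction in volume $v$; the function names are hypothetical.

```python
def linear_output(x1: int, x2: int) -> int:
    """Fire X1 -> 3Y and X2 -> 2Y until both inputs are exhausted; return #Y."""
    y = 0
    while x1 > 0:   # X1 -> 3Y
        x1 -= 1
        y += 3
    while x2 > 0:   # X2 -> 2Y
        x2 -= 1
        y += 2
    return y


def lone_pair_expected_time(volume: float) -> float:
    """Expected wait for A + B -> ... when #A = #B = 1 (unit rate constant assumed).

    The propensity is #A * #B / volume = 1 / volume, so the mean wait is volume.
    """
    count_a, count_b = 1, 1
    return volume / (count_a * count_b)


if __name__ == "__main__":
    print(linear_output(4, 5))             # f(4, 5) = 3*4 + 2*5
    print(lone_pair_expected_time(100.0))  # a single A and B in volume 100
```

The unimolecular reactions need no encounter between molecules, whereas the last remaining pair in a bimolecular reaction must find each other in a volume proportional to $n$.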

Acknowledgement. We are indebted to Anne Condon for helpful discussions and suggestions.

A Appendix

In this appendix, we prove some lemmas about the time complexity of certain common sequences 
of reactions. Some of these are implicit or explicit in many earlier papers on stochastic CRNs, but 
we include them for the sake of self-containment. 

The lemmas are stated with respect to a certain "initial configuration" c that may not be the 
initial configuration of an actual CRN we define. This is because the lemmas are employed to argue 
about CRNs that are guaranteed to evolve to some configuration c that satisfies the hypothesis of 
the lemma, and we use the lemma to bound the time it takes for the CRN to complete a sequence of 
reactions, starting from c. Therefore terms such as "applicable reaction" refer to being applicable 
from c and any configuration reachable from it, although some additional inapplicable reactions 
may have been applicable prior to reaching the configuration c. 

Lemma 2.2. Let $A = \{A_1, \ldots, A_m\}$ be a set of species with the property that they appear only
in applicable reactions of the form $A_i \to \sum_l B_l$, where $B_l \notin A$. Then starting from a configuration
$\mathbf{c}$ in which for all $i \in \{1, \ldots, m\}$, $\#_{\mathbf{c}} A_i = O(n)$, with volume $O(n)$, the expected time to reach a
configuration in which none of the described reactions can occur is $O(\log n)$.

Proof. Assume the hypothesis. Let $c \in \mathbb{N}$ be the constant such that $\sum_{i=1}^{m} \#_{\mathbf{c}} A_i \leq cn$. After
each relevant reaction occurs, this sum is reduced by 1. Therefore no reactions can occur after
$cn$ reactions have executed. If $\sum_{i=1}^{m} \#A_i = k$, the expected time for any reaction to occur is $\frac{1}{k}$.
By linearity of expectation, the expected time for $cn$ reactions to execute is at most $\sum_{k=1}^{cn} \frac{1}{k} = O(\log n)$. □
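The harmonic sum in this proof can be evaluated directly. The following Python sketch assumes unit rate constants, so that with $k$ molecules remaining the next unimolecular reaction occurs after expected time $1/k$; the function name is hypothetical.

```python
import math


def unimolecular_expected_time(total: int) -> float:
    """Sum of 1/k for k = 1..total: expected time for `total` unit-rate decays."""
    return sum(1.0 / k for k in range(1, total + 1))


if __name__ == "__main__":
    for total in (10, 1_000, 100_000):
        # The sum stays within 1 of ln(total), i.e. it grows logarithmically.
        print(total, unimolecular_expected_time(total), math.log(total))
```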

Lemma 2.3. Let $A = \{A_1, \ldots, A_m\}$ be a set of species with the property that they appear only in
applicable reactions of the form $A_i + A_j \to A_k + \sum_l B_l$, where $B_l \notin A$, and for all $i, j \in \{1, \ldots, m\}$,
there is at least one reaction $A_i + A_j \to \ldots$. Then starting from a configuration $\mathbf{c}$ in which for all
$i \in \{1, \ldots, m\}$, $\#_{\mathbf{c}} A_i = O(n)$, with volume $O(n)$, the expected time to reach a configuration in
which none of the described reactions can occur is $O(n)$.

Proof. Assume the hypothesis. Let $c \in \mathbb{N}$ be a constant such that $\sum_{i=1}^{m} \#_{\mathbf{c}} A_i \leq cn$, and let $c'$
be a constant such that the volume is at most $c'n$. After each relevant reaction occurs, this sum
is reduced by 1. Therefore no reactions can occur after $cn - 1$ reactions have executed. Now let
$p(\mathbf{c}, \alpha_{ij})$ be the propensity of the reaction $A_i + A_j \to A_k + \sum_l B_l$, which is equal to $p(\mathbf{c}, \alpha_{ji})$ as well.
Since $A_i$ can react with $A_j$ for any $i, j \in \{1, \ldots, m\}$, given that $\sum_{i=1}^{m} \#A_i = k$, the time for the
next reaction to occur is an exponential random variable with rate equal to the sum of the rates of
each possible reaction, i.e.,

$\sum_{i=1}^{m} \sum_{j=i}^{m} p(\mathbf{c}, \alpha_{ij}) = \frac{1}{2} \sum_{i \neq j} p(\mathbf{c}, \alpha_{ij}) + \sum_{i=1}^{m} p(\mathbf{c}, \alpha_{ii}) \geq \frac{1}{c'n} \left( \frac{1}{2} \sum_{i \neq j} \#A_i \, \#A_j + \sum_{i=1}^{m} \binom{\#A_i}{2} \right) = \frac{1}{c'n} \binom{k}{2}$,

so the expected time for the next reaction to occur is at most $c'n / \binom{k}{2}$. By linearity of expectation, the expected
time for $cn - 1$ reactions to execute is at most
$\sum_{k=2}^{cn} \frac{c'n}{\binom{k}{2}} = 2c'n \sum_{k=2}^{cn} \left( \frac{1}{k-1} - \frac{1}{k} \right) = 2c'n \left( 1 - \frac{1}{cn} \right) = O(n)$. □
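The linear bound for bimolecular reactions comes from a telescoping sum, which can be verified with exact rational arithmetic. The Python sketch below assumes unit rate constants and that every pair of the $k$ remaining molecules can react, so the total rate in volume $v$ is $\binom{k}{2}/v$; the function name and these modelling choices are assumptions made for illustration.

```python
from fractions import Fraction


def bimolecular_expected_time(total: int, volume: int) -> Fraction:
    """Sum of volume / C(k, 2) for k = 2..total (requires total >= 2).

    Each term is the expected wait for the next reaction when k molecules remain.
    """
    return sum(Fraction(2 * volume, k * (k - 1)) for k in range(2, total + 1))


if __name__ == "__main__":
    total, volume = 50, 50
    exact = bimolecular_expected_time(total, volume)
    closed_form = 2 * volume * (1 - Fraction(1, total))  # telescoped value
    print(exact, closed_form)
```

Because $1/\binom{k}{2} = 2\left(\frac{1}{k-1} - \frac{1}{k}\right)$, the sum collapses to $2v(1 - 1/N)$ for $N$ initial molecules, which is below $2v$ and hence linear when $v = O(n)$.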



Lemma 2.4. Let $C = \{C_1, \ldots, C_m\}$ and $A = \{A_1, \ldots, A_p\}$ be two sets of species with the
property that they appear only in applicable reactions of the form $C_i + A_j \to C_i + \sum_l B_l$, where
$B_l \notin A$. Then starting from a configuration $\mathbf{c}$ in which for all $i \in \{1, \ldots, m\}$, $\#_{\mathbf{c}} A_i = O(n)$ and
$\#_{\mathbf{c}} C_i = \Omega(n)$, with volume $O(n)$, the expected time to reach a configuration in which none of the
described reactions can occur is $O(\log n)$.

Proof. Assume the hypothesis. Then the counts of each $C_i$ do not decrease. (They may increase
if some $B_l \in C$, but this only strengthens the conclusion.) Therefore this is similar to the proof of
Lemma 2.2, since the expected time of each reaction when $\sum_{j=1}^{p} \#A_j = k$ is within a constant of